The previous slide established that the Combine (Merge) step, which combines two sorted lists of total size $n$, runs in linear time, $O(n)$. Now, we must analyze how many times this linear work is performed across the entire recursive process to derive the total complexity.
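
As a quick refresher, here is a minimal Python sketch of that linear-time merge (the function name `merge` and the list-based representation are illustrative choices, not fixed by the slides):

```python
def merge(left, right):
    """Merge two already-sorted lists into one sorted list in O(n) time."""
    merged = []
    i = j = 0
    # Each comparison appends exactly one element, so this loop runs
    # at most len(left) + len(right) = n times in total.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these is non-empty; both are linear scans.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```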

  • Merge Sort's efficiency stems from the structure of its recursion tree. It is a perfect example of a divide-and-conquer algorithm where the input array $A$ of size $n$ is repeatedly split in half.
  • Step 1: Determining the Number of Levels ($ \log n $ factor)
    • Since we repeatedly halve the input size $n$ until we reach the base case (size 1), the height of the recursion tree is the number of halvings $h$ satisfying $2^h = n$.
    • Solving for $h$ gives $h = \log_2 n$. Thus, there are $O(\log n)$ levels of recursion.
  • Step 2: Determining Work per Level ($n$ factor)
    • At any level $k$ of the tree, there are $2^k$ subproblems, each of size $\frac{n}{2^k}$. Merging them all touches each of the $n$ elements exactly once, so the total work per level is constant: $O(n)$.
    • Since the work done at each of the $O(\log n)$ levels is $O(n)$, the total time complexity is the product of these two factors: $O(n \log n)$. The instrumented sketch after this list makes both factors visible.
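
The following sketch, assuming illustrative instrumentation (a `work_per_level` tally keyed by a hypothetical `depth` parameter, neither from the slides), runs merge sort on $n = 16$ elements and records how many elements are merged at each level:

```python
from collections import defaultdict

work_per_level = defaultdict(int)  # level k -> total elements merged at level k

def merge_sort(a, depth=0):
    if len(a) <= 1:                      # base case: size 1, O(1) work
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid], depth + 1)
    right = merge_sort(a[mid:], depth + 1)
    work_per_level[depth] += len(a)      # the merge touches every element once
    merged, i, j = [], 0, 0              # standard linear two-pointer merge
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

merge_sort(list(range(16, 0, -1)))       # n = 16, a power of 2
for level, total in sorted(work_per_level.items()):
    print(f"level {level}: {total} elements merged")
# Prints 16 elements merged at each of the log2(16) = 4 merging levels.
```

Each of the $\log_2 16 = 4$ merging levels processes all 16 elements: $O(n)$ work per level, across $O(\log n)$ levels.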
Complexity Derivation

$T(n) = 2T\left(\frac{n}{2}\right) + O(n)$

$\text{Work at level } k = 2^k \cdot O\left(\frac{n}{2^k}\right) = O(n)$

$T(n) = O(n) \cdot (\text{number of levels})$

$T(n) = O(n) \cdot O(\log n) = O(n \log n)$

Assumptions: The input size $n$ is a power of 2 for simplicity, and the time for the base case $T(1)$ is $O(1)$.
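
Under these assumptions, the recurrence can be unrolled explicitly. Writing the $O(n)$ merge cost as $cn$ for some constant $c$ (introduced here purely for the expansion), each substitution peels off one level of linear work:

$T(n) = 2T\left(\tfrac{n}{2}\right) + cn = 4T\left(\tfrac{n}{4}\right) + 2cn = \cdots = 2^k \, T\left(\tfrac{n}{2^k}\right) + k \cdot cn$

$\text{At } k = \log_2 n: \quad T(n) = n \cdot T(1) + cn \log_2 n = O(n \log n)$

After $\log_2 n$ expansions the recursion bottoms out at $T(1) = O(1)$, leaving $n$ constant-time base cases plus $\log_2 n$ levels of linear merging.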